Fast software multiplication in F2[x] for embedded processors
نویسنده
چکیده
We present a novel method for fast multiplication of polynomials over 2 which can be implemented efficiently in embedded software. Fast polynomial multiplication methods are needed for the efficient implementation of some cryptographic and coding applications. The proposed method follows a strategy to reduce the memory accesses for input data and intermediate values during computation. This strategy speeds up the binary polynomial multiplication significantly on typical embedded processors with limited memory bandwidth. These multiplications are usually performed by the comb method or the Karatsuba-based methods in embedded software. The proposed method has speed and memory advantages over these methods on embedded platforms for the polynomial degrees usually encountered in practical cryptosystems. We perform a detailed complexity analysis of the proposed method and complexity comparisons with the other methods. Finally, we present the running times of the proposed method and its alternatives on ARM7TDMI processor.
منابع مشابه
Optimizing Matrix-matrix Multiplication for an Embedded Vliw Processor
The optimization of matrix-matrix multiplication (MMM) performance has been well studied on conventional general-purpose processors like the Intel Pentium 4. Fast algorithms, such as those in the Goto and ATLAS BLAS libraries, exploit common microarchitectural features including superscalar execution and the cache and TLB hierarchy to achieve near-peak performance. However, the microarchitectur...
متن کاملFast Software Polynomial Multiplication on ARM Processors Using the NEON Engine
Efficient algorithms for binary field operations are required in several cryptographic operations such as digital signatures over binary elliptic curves and encryption. The main performance-critical operation in these fields is the multiplication, since most processors do not support instructions to carry out a polynomial multiplication. In this paper we describe a novel software multiplier for...
متن کاملASIC Design of Butterfly Unit Based on Non-Redundant and Redundant Algorithm
Fast Fourier Transform (FFT) processors employed with pipeline architecture consist of series of Processing Elements (PE) or Butterfly Units (BU). BU or PE of FFT performs multiplication and addition on complex numbers. This paper proposes a single BU to compute radix-2, 8 point FFT in the time domain as well as frequency domain by replacing a series of PEs. This BU comprises of fused floating ...
متن کاملImproved Lower Bounds for Testing Triangle-freeness in Boolean Functions via Fast Matrix Multiplication
Understanding the query complexity for testing linear-invariant properties has been a central open problem in the study of algebraic property testing. Triangle-freeness in Boolean functions is a simple property whose testing complexity is unknown. Three Boolean functions f1, f2 and f3 : F2 → {0, 1} are said to be triangle free if there is no x, y ∈ F2 such that f1(x) = f2(y) = f3(x + y) = 1. Th...
متن کاملMatrix-Matrix Multiplications and Fault Tolerance on Hypercube Multiprocessors
Several new algorithms for matrix-matrix multiplications on hypercube multiprocessors are presented and evaluated based on the number of multiplications, additions, and transfers. The matrices ~I be multiplied are uniformly distributed to all processors of a hypercube system. Each processor owns some submatrices which are derived by dividing the source matrices. Each submatrix multiplication ca...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2012